This document is a demonstration of how to use EBImage to quanitfy patterns in microscopy data, in particular from stiched, multi-channel, raw EVOS images, with each channel saved as a separate large stiched image. You will need the images themselves, the knowledge of which image is which color channel (since they are all greyscale), and the proteins or molecules represented in each channel. It will also be useful to know what combination of positive-markers and negative markers are of interest for you specific project and/or experiment.
# Packages (what software do we need)
require(EBImage)
require(reshape2)
require(ggplot2)
require(Seurat)
require(ggpubr)
require(knitr)
# Load local quantification functions (homemade functions that make this script run)
source('/Users/msbr/GitHub/ImageQuant/R Functions/LoadImageChannels.R')
source('/Users/msbr/GitHub/ImageQuant/R Functions/PreProcessImage.R')
source('/Users/msbr/GitHub/ImageQuant/R Functions/QuantFunction.R')
source('/Users/msbr/GitHub/ImageQuant/R Functions/QuantifyOneLabel.R')
source('/Users/msbr/GitHub/ImageQuant/R Functions/ChunkIt.R')
source('/Users/msbr/GitHub/ImageQuant/R Functions/MeltIntoDataframe.R')
source('/Users/msbr/GitHub/ImageQuant/R Functions/NormalizeImages.R')
Great. Not we are ready to actually process some data. Let’s use Satoshi’s data comparing Sox9 media with matrigel additive to Sox9 media without matrigel additive.
We will run the process, for each image to be analyzed, indepedently, and then combine the outputs later.
Here, we load the Matrigel-negative condition:
#### Matrigel - ####
# Load image channels
images.raw <- LoadImageChannels(channel.filepaths =
c("~/Library/CloudStorage/GoogleDrive-michasam.raredon@yale.edu/My Drive/Raredon_Lab_Administration/Lab Members/Satoshi/EVOS images_101323/BASC isolation#1_P0(P1)_SOX9m/101323_BASC isolation #1_P0(P1)_SOX9m_ABCA3.488_Vimentin.555_SOX9.647_Bottom Slide_TR_p00_0_A01f00d0.TIF",
"~/Library/CloudStorage/GoogleDrive-michasam.raredon@yale.edu/My Drive/Raredon_Lab_Administration/Lab Members/Satoshi/EVOS images_101323/BASC isolation#1_P0(P1)_SOX9m/101323_BASC isolation #1_P0(P1)_SOX9m_ABCA3.488_Vimentin.555_SOX9.647_Bottom Slide_TR_p00_0_A01f00d1.TIF",
"~/Library/CloudStorage/GoogleDrive-michasam.raredon@yale.edu/My Drive/Raredon_Lab_Administration/Lab Members/Satoshi/EVOS images_101323/BASC isolation#1_P0(P1)_SOX9m/101323_BASC isolation #1_P0(P1)_SOX9m_ABCA3.488_Vimentin.555_SOX9.647_Bottom Slide_TR_p00_0_A01f00d2.TIF",
"~/Library/CloudStorage/GoogleDrive-michasam.raredon@yale.edu/My Drive/Raredon_Lab_Administration/Lab Members/Satoshi/EVOS images_101323/BASC isolation#1_P0(P1)_SOX9m/101323_BASC isolation #1_P0(P1)_SOX9m_ABCA3.488_Vimentin.555_SOX9.647_Bottom Slide_TR_p00_0_A01f00d3.TIF"),
channel.names =
c('Abca3','DAPI','Vim','Sox9'))
We can inspect the image directly:
display(images.raw$DAPI)
Can’t see anything yet! That’s because we need to normalize the data so that the signal shows up as a visible shade of white over a black backgroun:
# Normalize each channel
images.norm <- NormalizeImages(images.raw)
display(images.norm$DAPI)
display(images.norm$Sox9)
Great. Now we can see our images. But they have ranges of values, and we really would prefer binary values at some threshold. But how do we determine the right threshold? Here, we use a technique called ‘adaptive thresholding’, which comes built-in to EBImage. You should play around with the ‘box.dim’ and ‘offset’ parameters to get a picture that looks good for you! These settings should be standardized for any set of samples that will be compared to one another.
# PreProcess image channels
images.processed <- PreProcessImage(images.norm = images.norm,
box.dim = 1000,
offset = 0.01) # These parameters will make a huge difference in your outputs and should be tuned + checked
display(images.processed$DAPI)
display(images.processed$Sox9)
display(images.processed$Abca3)
display(images.processed$Vim)
Excellent. Now we will bring all of the information from each channel together into one single dataframe, whih will allow us to perform logical operations on each row (pixel, in this case)
# Melt into dataframe
image.object <- MeltIntoDataframe(images.processed)
Next, we want to bin the data into chunks. Basicaly, take the big image and break it into little pieces, so that we can understand the statisical ‘spread’ of the outputs and get distributions as our end result. You should experiment with ‘num.pixels.per.chunk’ values to get a good number of output measurements that make sense to you and allow you to see a good amount of varince in your data without measuring millions of values.
# Define number of pixels within each chunk
num.pixels.per.chunk = 500000 # experiment with chunk size to yield a good number of samples
num.pix <- nrow(image.object) # total number of pixels
n.chunks <- floor(num.pix/num.pixels.per.chunk)
message(paste(num.pixels.per.chunk,'pixels per chunk will yield',n.chunks,'total chunks'))
Once you are happy with the above values, we break our data into that many chunks, as follows:
chunked.data <- ChunkIt(image.object = image.object,
n.chunks = n.chunks)
Great! Next, we need to identify specific metrics we would like to evaluate.
This means that we, as the user, need to DEFINE what we are asking to measure.
Example: Let’s say that we want to ask what fraction of these pixels are BASC-like. We need to tell the code how to identify BASC-like pixels, and what to compare these to within the image (because our output will be a ratio/percentage.) I think that a good definition of a BASC-like pixel is a pixel that is Sox9-positive, Abca3-positive, and Vimentin-negative. But I don’t want to compare the representation of this pattern to ALL pixels. I want to compare this signature within the limited set of DAPI-positive signals. So, I might write the following:
quick.demo <- QuantifyOneLabel(chunked.data,
label.name = 'BASC-like Epithelium',
base.criteria.positive = c('DAPI'),
label.criteria.positive = c('Sox9','Abca3'),
label.criteria.negative = c('Vim'))
But that only quantifies one thing! We would love to quantify multiple things at once and then combine all the results together for plotting. So we can stack these queries, as follows:
# Build the entire dataset needed for plotting
output.data <- data.frame('BASC-like Epithelium' = QuantifyOneLabel(chunked.data,
label.name = 'BASC-like Epithelium',
base.criteria.positive = c('DAPI'),
label.criteria.positive = c('Sox9','Abca3'),
label.criteria.negative = c('Vim')),
'Tuft-like Epithelium' = QuantifyOneLabel(chunked.data,
label.name = 'Tuft-like Epithelium',
base.criteria.positive = c('DAPI'),
label.criteria.positive = c('Sox9'),
label.criteria.negative = c('Vim','Abca3')),
'Sox9-Positive Epithelium' = QuantifyOneLabel(chunked.data,
label.name = 'Sox9-Positive Epithelium',
base.criteria.positive = c('DAPI'),
label.criteria.positive = c('Sox9'),
label.criteria.negative = c('Vim')),
'Mesenchyme' = QuantifyOneLabel(chunked.data,
label.name = 'Mesenchyme',
base.criteria.positive = c('DAPI'),
label.criteria.positive = c('Vim'),
label.criteria.negative = c('Sox9','Abca3')),
'Epi.Tuft-like' = QuantifyOneLabel(chunked.data,
label.name = 'Mesenchyme',
base.criteria.positive = c('DAPI'),
base.criteria.negative = c('Vim'),
label.criteria.positive = c('Sox9'),
label.criteria.negative = c('Abca3')))
# Label with condition
output.data$Condition <- 'Matrigel-'
# Stash so we don't overwrite
data1 <- melt(output.data)
The object ‘data1’ now contains all the information that we requested from the Matrigel-negative image. We can take a direct loook at data1 by running the following code:
#View(data1)
kable(data1[1:50,])
| Condition | variable | value |
|---|---|---|
| Matrigel- | BASC.like.Epithelium | NA |
| Matrigel- | BASC.like.Epithelium | NA |
| Matrigel- | BASC.like.Epithelium | NA |
| Matrigel- | BASC.like.Epithelium | NA |
| Matrigel- | BASC.like.Epithelium | NA |
| Matrigel- | BASC.like.Epithelium | NA |
| Matrigel- | BASC.like.Epithelium | NA |
| Matrigel- | BASC.like.Epithelium | NA |
| Matrigel- | BASC.like.Epithelium | NA |
| Matrigel- | BASC.like.Epithelium | 8.6044390 |
| Matrigel- | BASC.like.Epithelium | 6.2655508 |
| Matrigel- | BASC.like.Epithelium | 0.0000000 |
| Matrigel- | BASC.like.Epithelium | 0.3237743 |
| Matrigel- | BASC.like.Epithelium | 0.5119953 |
| Matrigel- | BASC.like.Epithelium | 4.4277494 |
| Matrigel- | BASC.like.Epithelium | 4.6582944 |
| Matrigel- | BASC.like.Epithelium | 3.7056413 |
| Matrigel- | BASC.like.Epithelium | 1.4042955 |
| Matrigel- | BASC.like.Epithelium | 2.1997232 |
| Matrigel- | BASC.like.Epithelium | 2.0526551 |
| Matrigel- | BASC.like.Epithelium | 2.7506310 |
| Matrigel- | BASC.like.Epithelium | 3.4263575 |
| Matrigel- | BASC.like.Epithelium | 5.6325088 |
| Matrigel- | BASC.like.Epithelium | 2.3594937 |
| Matrigel- | BASC.like.Epithelium | 7.7093272 |
| Matrigel- | BASC.like.Epithelium | 9.0794358 |
| Matrigel- | BASC.like.Epithelium | 2.8582174 |
| Matrigel- | BASC.like.Epithelium | 1.4758312 |
| Matrigel- | BASC.like.Epithelium | 1.8681829 |
| Matrigel- | BASC.like.Epithelium | 7.1916532 |
| Matrigel- | BASC.like.Epithelium | 3.0139935 |
| Matrigel- | BASC.like.Epithelium | 4.0732054 |
| Matrigel- | BASC.like.Epithelium | 0.9835906 |
| Matrigel- | BASC.like.Epithelium | 2.3808253 |
| Matrigel- | BASC.like.Epithelium | 4.5719881 |
| Matrigel- | BASC.like.Epithelium | 3.1916626 |
| Matrigel- | BASC.like.Epithelium | 0.6478482 |
| Matrigel- | BASC.like.Epithelium | 3.4832235 |
| Matrigel- | BASC.like.Epithelium | 3.5730053 |
| Matrigel- | BASC.like.Epithelium | 6.4639739 |
| Matrigel- | BASC.like.Epithelium | 3.7464364 |
| Matrigel- | BASC.like.Epithelium | 3.0261496 |
| Matrigel- | BASC.like.Epithelium | 1.7299201 |
| Matrigel- | BASC.like.Epithelium | 3.2601973 |
| Matrigel- | BASC.like.Epithelium | 3.5428606 |
| Matrigel- | BASC.like.Epithelium | 1.9517636 |
| Matrigel- | BASC.like.Epithelium | 1.2214722 |
| Matrigel- | BASC.like.Epithelium | 0.6652306 |
| Matrigel- | BASC.like.Epithelium | 1.8988455 |
| Matrigel- | BASC.like.Epithelium | 1.2092313 |
Excellent. Looks good. Now, let’s run exactly the same workflow for the Matrigel-positive condition:
#### Matrigel + ####
# Load image channels
images.raw <- LoadImageChannels(channel.filepaths =
c("~/Library/CloudStorage/GoogleDrive-michasam.raredon@yale.edu/My Drive/Raredon_Lab_Administration/Lab Members/Satoshi/EVOS images_101323/BASC isolation#1_P0(P1)_SOX9m+Matrigel/101323_BASC isolation #1_P0(P1)_SOX9m+Matrigel_ABCA3.488_Vimentin.555_SOX9.647_Bottom Slide_TR_p00_0_A01f00d0.TIF",
"~/Library/CloudStorage/GoogleDrive-michasam.raredon@yale.edu/My Drive/Raredon_Lab_Administration/Lab Members/Satoshi/EVOS images_101323/BASC isolation#1_P0(P1)_SOX9m+Matrigel/101323_BASC isolation #1_P0(P1)_SOX9m+Matrigel_ABCA3.488_Vimentin.555_SOX9.647_Bottom Slide_TR_p00_0_A01f00d1.TIF",
"~/Library/CloudStorage/GoogleDrive-michasam.raredon@yale.edu/My Drive/Raredon_Lab_Administration/Lab Members/Satoshi/EVOS images_101323/BASC isolation#1_P0(P1)_SOX9m+Matrigel/101323_BASC isolation #1_P0(P1)_SOX9m+Matrigel_ABCA3.488_Vimentin.555_SOX9.647_Bottom Slide_TR_p00_0_A01f00d2.TIF",
"~/Library/CloudStorage/GoogleDrive-michasam.raredon@yale.edu/My Drive/Raredon_Lab_Administration/Lab Members/Satoshi/EVOS images_101323/BASC isolation#1_P0(P1)_SOX9m+Matrigel/101323_BASC isolation #1_P0(P1)_SOX9m+Matrigel_ABCA3.488_Vimentin.555_SOX9.647_Bottom Slide_TR_p00_0_A01f00d3.TIF"),
channel.names = c('Abca3','DAPI','Vim','Sox9'))
display(images.raw$DAPI)
# Normalize each channel
images.norm <- NormalizeImages(images.raw)
display(images.norm$DAPI)
# PreProcess image channels
images.processed <- PreProcessImage(images.norm = images.norm,
box.dim = 1000,
offset = 0.01)
display(images.processed$DAPI)
display(images.processed$Sox9)
display(images.processed$Abca3)
display(images.processed$Vim)
# Melt into dataframe
image.object <- MeltIntoDataframe(images.processed)
# Define number of pixels within each chunk
num.pixels.per.chunk = 500000 # experiment with chunk size to yield a good number of samples
num.pix <- nrow(image.object) # total number of pixels
n.chunks <- floor(num.pix/num.pixels.per.chunk)
message(paste(num.pixels.per.chunk,'pixels per chunk will yield',n.chunks,'total chunks'))
chunked.data <- ChunkIt(image.object = image.object,
n.chunks = n.chunks)
# Build the entire dataset needed for plotting
output.data <- data.frame('BASC-like Epithelium' = QuantifyOneLabel(chunked.data,
label.name = 'BASC-like Epithelium',
base.criteria.positive = c('DAPI'),
label.criteria.positive = c('Sox9','Abca3'),
label.criteria.negative = c('Vim')),
'Tuft-like Epithelium' = QuantifyOneLabel(chunked.data,
label.name = 'Tuft-like Epithelium',
base.criteria.positive = c('DAPI'),
label.criteria.positive = c('Sox9'),
label.criteria.negative = c('Vim','Abca3')),
'Sox9-Positive Epithelium' = QuantifyOneLabel(chunked.data,
label.name = 'Sox9-Positive Epithelium',
base.criteria.positive = c('DAPI'),
label.criteria.positive = c('Sox9'),
label.criteria.negative = c('Vim')),
'Mesenchyme' = QuantifyOneLabel(chunked.data,
label.name = 'Mesenchyme',
base.criteria.positive = c('DAPI'),
label.criteria.positive = c('Vim'),
label.criteria.negative = c('Sox9','Abca3')),
'Epi.Tuft-like' = QuantifyOneLabel(chunked.data,
label.name = 'Mesenchyme',
base.criteria.positive = c('DAPI'),
base.criteria.negative = c('Vim'),
label.criteria.positive = c('Sox9'),
label.criteria.negative = c('Abca3')))# Label with condition
output.data$Condition <- 'Matrigel+'
# Stash so we don't overwrite
data2 <- melt(output.data)
Ok great. Now we have all the information we want, from both conditions. But we want to compare them together in unified plots! So we need to bring these two data objects together. We can do this as follows, using the ‘rbind’ command (‘row-bind’) which concatenates data-frames in R.
# Bind two datasets together for plotting
data <- rbind(data1,data2)
Then, we are all set to begin exploring and making various plots using ggplot. These are my best efforts, but there are a lot of options and you should experiment with new and optimal ways of using ggplot to visualize these data.
First, let’s just set up a basic ggplot object that start us looking at the data
ggplot(data=data,
aes(x=Condition, y=value,fill=variable,color=variable))+
geom_violin()
This is not bade! But we’d like to see all of our measurements, in addition to the distributions. So, let’s add a geom_point
ggplot(data=data,
aes(x=Condition, y=value,fill=variable,color=variable))+
geom_violin()+
geom_point(color='black',
size=0.25)
Ok, so it is an issue that all of our points are at a single x-coordinate in each condition. No good. So, following some Googling, we figure out that this should work:
ggplot(data=data,
aes(x=Condition, y=value,fill=variable,color=variable))+
geom_violin()+
geom_point(color='black',
size=0.25,
position = position_jitterdodge(dodge.width = 0.9,jitter.width=0.25))
Very nice! Let’s get rid of the silly looking grey background:
ggplot(data=data,
aes(x=Condition, y=value,fill=variable,color=variable))+
geom_violin()+
geom_point(color='black',
size=0.25,
position = position_jitterdodge(dodge.width = 0.9,jitter.width=0.25))+
theme_classic()
I like that a lot better. Let’s label the y-axis:
ggplot(data=data,
aes(x=Condition, y=value,fill=variable,color=variable))+
geom_violin()+
geom_point(color='black',
size=0.25,
position = position_jitterdodge(dodge.width = 0.9,jitter.width=0.25))+
theme_classic()+
ylab('Fraction of DAPI Pixels')
And add a title:
ggplot(data=data,
aes(x=Condition, y=value,fill=variable,color=variable))+
geom_violin()+
geom_point(color='black',
size=0.25,
position = position_jitterdodge(dodge.width = 0.9,jitter.width=0.25))+
theme_classic()+
ylab('Fraction of DAPI Pixels')+
ggtitle('Effect of Matrigel on Epithelial Stemness in 2D Culture')
And update the legend title:
ggplot(data=data,
aes(x=Condition, y=value,fill=variable,color=variable))+
geom_violin()+
geom_point(color='black',
size=0.25,
position = position_jitterdodge(dodge.width = 0.9,jitter.width=0.25))+
theme_classic()+
ylab('Fraction of DAPI Pixels')+
ggtitle('Effect of Matrigel on Epithelial Stemness in 2D Culture')+
guides(fill=guide_legend(title="Expression Signature"),
color=guide_legend(title="Expression Signature")) # We need to do this twice, because I have told it to use both the 'color' (the violin outline) and 'fill' (the violin fill) attributes
Finally, let’s update the colors, so that we can use this plot for publication (ggplot default colors should not be used for publication, almost ever, unless that is specifically a stated goal for some reason, like emphasizing that we are deliberately using a default output or classification without modification. Not applicable here, so let’s put in some fancy colors to make us look good).
ggplot(data=data,
aes(x=Condition, y=value,fill=variable,color=variable))+
geom_violin()+
geom_point(color='black',
size=0.25,
position = position_jitterdodge(dodge.width = 0.9,jitter.width=0.25))+
theme_classic()+
ylab('Fraction of DAPI Pixels')+
ggtitle('Effect of Matrigel on Epithelial Stemness in 2D Culture')+
guides(fill=guide_legend(title="Expression Signature"),
color=guide_legend(title="Expression Signature"))+
scale_fill_manual(values = c('#93B7BE','#F19A3E','#3D3B8E','#E072A4','red'))+
scale_color_manual(values = c('#93B7BE','#F19A3E','#3D3B8E','#E072A4','red')) # Again, need to modify BOTH the color and fill arguments, if we want this to look clean
Great. Very happy with this. We can now save this plot to our hardrive, if we want, as follows (this does a PNG file, which I like because it is a raster image that can be made very high-resoltion, but is still displayed easily by most programs. PDF also works, but generates very large files sometimes that some programs struggle to render. Publication-quality figures must ALWAYS be made with a resolution of at least 300 dots-per-inch (dpi) which is covered here by the combination of units being ‘in’ (inches) and res (‘resolution’) being ‘300’. The width and height should be manipulated to yield the desired aspect ratio and font sizing. (Font sizing can also be modified directly, within ggplot, but I am not demonstrating that here.)
png('Test.Output.Demo.png',width = 7,height = 5,units='in',res=300)
ggplot(data = data,
aes(x = Condition,y=value,fill=variable,color=variable))+
geom_violin()+
geom_point(position = position_jitterdodge(dodge.width = 0.9,jitter.width=0.25),size=0.1,color='black')+
theme_classic()+
ggtitle('Effect of Matrigel on Epithelial Stemness in 2D Culture')+
ylab('Fraction of DAPI+ Pixels')+
scale_fill_manual(values = c('#93B7BE','#F19A3E','#3D3B8E','#E072A4','red'))+
scale_color_manual(values = c('#93B7BE','#F19A3E','#3D3B8E','#E072A4','red'))
dev.off()
## quartz_off_screen
## 2
Finally, note that the organization that I am showing here is by no means mandatory. Maybe we think the graph would be clearer if we group the colors tgoether so that we can compare the distribution of each metric between conditions, quickly. You can change the way ggplot is running to achieve this. Here, I switch the order around, change the color scheme, add an x-axis global label, and rotate the x-axis category labels to try to make a nice looking plot:
ggplot(data = data,
aes(x = variable,y=value,fill=Condition,color=Condition))+
geom_violin()+
geom_point(position = position_jitterdodge(dodge.width = 0.9,jitter.width=0.25),size=0.1,color='black')+
theme_classic()+
ggtitle('Effect of Matrigel on Epithelial Stemness in 2D Culture')+
ylab('Fraction of DAPI+ Pixels')+
xlab('Expression Signature')+
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1))+
scale_fill_manual(values = c('blue3','red3'))+
scale_color_manual(values = c('blue3','red3'))
Looks good! Then we can save to hardrive if we want to:
png('Test.Output.Demo.Reorganized.png',width = 7,height = 5,units='in',res=300)
ggplot(data = data,
aes(x = variable,y=value,fill=Condition,color=Condition))+
geom_violin()+
geom_point(position = position_jitterdodge(dodge.width = 0.9,jitter.width=0.25),size=0.1,color='black')+
theme_classic()+
ggtitle('Effect of Matrigel on Epithelial Stemness in 2D Culture')+
ylab('Fraction of DAPI+ Pixels')+
xlab('Expression Signature')+
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1))+
scale_fill_manual(values = c('blue3','red3'))+
scale_color_manual(values = c('blue3','red3'))
dev.off()
## quartz_off_screen
## 2
Let me know if questions! Feel free to modify this markdown document, borrow and evolve code, etc to suit your purposes.
We also may want to modify the code evenutally to operate on a cell-level (i.e., leveraging image parcellation into ‘cells’ in the initial steps,) rather than using pixels, but I’m not sure how to do this yet. Does anyone have ideas?